HapCUT: an efficient and accurate algorithm for the haplotype assembly problem

Authors

  • Vikas Bansal
  • Vineet Bafna
Abstract

MOTIVATION: The goal of the haplotype assembly problem is to reconstruct the two haplotypes (chromosomes) of an individual from a mix of sequenced fragments drawn from the two chromosomes. This problem has been shown to be computationally intractable for various optimization criteria. Polynomial time algorithms have been proposed for restricted versions of the problem. In this article, we consider the haplotype assembly problem in the most general setting, i.e. fragments of any length and with an arbitrary number of gaps.

RESULTS: We describe a novel combinatorial approach for the haplotype assembly problem based on computing max-cuts in certain graphs derived from the sequenced fragments. Levy et al. have sequenced the complete genome of a human individual and used a greedy heuristic to assemble the haplotypes for this individual. We have applied our method, HapCUT, to infer haplotypes from this data and demonstrate that the haplotypes inferred using HapCUT are significantly more accurate (20-25% lower maximum error correction scores for all chromosomes) than those from the greedy heuristic and from a previously published method, Fast Hare. We also describe a maximum likelihood based estimator of the absolute accuracy of the sequence-based haplotypes using population haplotypes from the International HapMap project.

AVAILABILITY: A program implementing HapCUT is available on request.
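The error correction score used above to compare methods can be illustrated with a small sketch. It assumes the score refers to the standard minimum error correction (MEC) objective: the smallest number of fragment alleles that must be flipped so that every fragment agrees with one of the two haplotypes. The string encoding ('0'/'1' for alleles, '-' for a gap) and the function names are illustrative assumptions, not taken from the HapCUT program, and the sketch only scores a given haplotype pair; it does not implement the max-cut search itself.

```python
def mismatches(fragment, haplotype):
    """Count covered positions where the fragment disagrees with the haplotype.

    Positions marked '-' are gaps (not covered by the fragment) and are skipped.
    """
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def complement(haplotype):
    """Return the second haplotype, assuming every site is heterozygous."""
    return "".join("1" if allele == "0" else "0" for allele in haplotype)


def mec_score(fragments, haplotype):
    """Sum, over fragments, of the mismatches to the closer of the two haplotypes."""
    other = complement(haplotype)
    return sum(
        min(mismatches(f, haplotype), mismatches(f, other)) for f in fragments
    )


# Hypothetical toy data: four fragments over four variant sites.
fragments = ["00-1", "110-", "-011", "0-01"]

print(mec_score(fragments, "0001"))  # 2
print(mec_score(fragments, "0011"))  # 1
```

In this toy input the pair (0011, 1100) explains the fragments with one flipped allele, while (0001, 1110) needs two, so the first pair has the lower score. A haplotype and its complement always receive the same score, since they describe the same pair.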


Similar resources

HapCompass: A Fast Cycle Basis Algorithm for Accurate Haplotype Assembly of Sequence Data

Genome assembly methods produce haplotype phase ambiguous assemblies due to limitations in current sequencing technologies. Determining the haplotype phase of an individual is computationally challenging and experimentally expensive. However, haplotype phase information is crucial in many bioinformatics workflows such as genetic association studies and genomic imputation. Current computational ...


A Hybrid Unconscious Search Algorithm for Mixed-model Assembly Line Balancing Problem with SDST, Parallel Workstation and Learning Effect

Due to the variety of products, simultaneous production of different models plays an important role in production systems. Moreover, realistic constraints in the design of production lines have attracted considerable attention in recent research. Since the assembly line balancing problem is NP-hard, efficient methods are needed to solve this kind of problem. In this study, a new hybrid met...


A Multi-Objective Particle Swarm Optimization for Mixed-Model Assembly Line Balancing with Different Skilled Workers

This paper presents a multi-objective Particle Swarm Optimization (PSO) algorithm for worker assignment and mixed-model assembly line balancing problem when task times depend on the worker’s skill level. The objectives of this model are minimization of the number of stations (equivalent to the maximization of the weighted line efficiency), minimization of the weighted smoothness index and minim...


Simultaneous Multi-Skilled Worker Assignment and Mixed-Model Two-Sided Assembly Line Balancing

This paper addresses a multi-objective mathematical model for mixed-model two-sided assembly line balancing and worker assignment with different skills. In this problem, the operation time of each task depends on the skill of the worker. The following objective functions are considered in the mathematical model: (1) minimizing the number of mated-stations, (2) minimizing the number of ...


An Analytical Approach for Single and Mixed-Model Assembly Line Rebalancing and Worker Assignment Problem

In this paper, an analytical approach is used for assembly line rebalancing and worker assignment for single and mixed-model assembly lines based on a heuristic-simulation algorithm. This approach helps managers select a better marketing strategy when different combinations of demands are suitable. Furthermore, they can use it as a guideline to know which worker assignment is better for ea...



Journal:
  • Bioinformatics

Volume 24, Issue 16

Pages: -

Publication date: 2008